A Appendix for Learning Signal-Agnostic Manifolds of Neural Fields
In Section A.1 below, we provide details on training settings as well as the underlying baseline architectures. We next describe the details necessary to reproduce each of our empirical results, and provide source locations to download each of the datasets used in the paper. GEM significantly outperforms baselines on test CelebA-HQ reconstruction (Table 1: results of different methods evaluated across 3 different seeds).
Learning Signal-Agnostic Manifolds of Neural Fields
Du, Yilun, Collins, Katherine M., Tenenbaum, Joshua B., Sitzmann, Vincent
Deep neural networks have been used widely to learn the latent structure of datasets, across modalities such as images, shapes, and audio signals. However, existing models are generally modality-dependent, requiring custom architectures and objectives to process different classes of signals. We leverage neural fields to capture the underlying structure in image, shape, audio and cross-modal audiovisual domains in a modality-independent manner. We cast our task as one of learning a manifold, where we aim to infer a low-dimensional, locally linear subspace in which our data resides. By enforcing coverage of the manifold, local linearity, and local isometry, our model -- dubbed GEM -- learns to capture the underlying structure of datasets across modalities. We can then travel along linear regions of our manifold to obtain perceptually consistent interpolations between samples, and can further use GEM to recover points on our manifold and glean not only diverse completions of input images, but cross-modal hallucinations of audio or image signals. Finally, we show that by walking across the underlying manifold of GEM, we may generate new samples in our signal domains. Code and additional results are available at https://yilundu.github.io/gem/.
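The abstract mentions enforcing local isometry on the learned manifold and traveling along linear regions to interpolate between samples. A minimal sketch of what such components might look like is given below; the function names, the batch-pairwise formulation of the isometry penalty, and the plain linear interpolation are our illustrative assumptions, not the paper's exact losses:

```python
import numpy as np

def local_isometry_loss(z, x, scale=1.0):
    # Hypothetical local-isometry penalty: pairwise distances between
    # latent codes z should match (up to a scale) pairwise distances
    # between the corresponding signals x within a batch.
    dz = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    dx = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return ((dz - scale * dx) ** 2).mean()

def interpolate(z1, z2, steps=8):
    # Walk a straight line in latent space; on a locally linear
    # manifold this yields perceptually consistent intermediates.
    ts = np.linspace(0.0, 1.0, steps)
    return [(1.0 - t) * z1 + t * z2 for t in ts]
```

When z and x coincide the isometry penalty vanishes, which is the sense in which the latent space preserves local geometry.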
Conditional out-of-sample generation for unpaired data using trVAE
Lotfollahi, Mohammad, Naghipourfar, Mohsen, Theis, Fabian J., Wolf, F. Alexander
While generative models have shown great success in generating high-dimensional samples conditional on low-dimensional descriptors (learning e.g. stroke thickness in MNIST, hair color in CelebA, or speaker identity in Wavenet), their out-of-sample generation poses fundamental problems. The conditional variational autoencoder (CVAE), as a simple conditional generative model, does not explicitly relate conditions during training and hence has no incentive to learn a compact joint distribution across conditions. We overcome this limitation by matching their distributions using maximum mean discrepancy (MMD) in the decoder layer that follows the bottleneck. This introduces a strong regularization both for reconstructing samples within the same condition and for transforming samples across conditions, resulting in much improved generalization. We refer to the architecture as \emph{transformer} VAE (trVAE). Benchmarking trVAE on high-dimensional image and tabular data, we demonstrate higher robustness and higher accuracy than existing approaches. In particular, we show qualitatively improved predictions for cellular perturbation response to treatment and disease based on high-dimensional single-cell gene expression data, by tackling previously problematic minority classes and multiple conditions. For generic tasks, we improve Pearson correlations of high-dimensional estimated means and variances with their ground truths from 0.89 to 0.97 and 0.75 to 0.87, respectively.
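The MMD term described above can be estimated from two batches of layer activations with a kernel two-sample statistic. A minimal sketch follows; the RBF kernel choice, the fixed bandwidth, and the biased estimator are our assumptions for illustration, not necessarily the exact configuration used in trVAE:

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    # Pairwise RBF (Gaussian) kernel matrix between rows of x and y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def mmd2(x, y, gamma=1.0):
    # Biased estimate of squared maximum mean discrepancy between the
    # empirical distributions of x and y; zero when x and y coincide,
    # positive when the two condition groups' activations differ.
    kxx = rbf_kernel(x, x, gamma).mean()
    kyy = rbf_kernel(y, y, gamma).mean()
    kxy = rbf_kernel(x, y, gamma).mean()
    return kxx + kyy - 2.0 * kxy
```

In training, x and y would be the post-bottleneck decoder activations for two different conditions, and mmd2 would be added to the CVAE objective as a regularizer pushing those distributions together.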